272 research outputs found

Diffusion-Based Audio Inpainting

Author: Moliner Eloi
Välimäki Vesa
Publication date: 24/05/2023

Audio inpainting aims to reconstruct missing segments in corrupted recordings. Previous methods produce plausible reconstructions when the gap length is shorter than about 100 ms, but the quality decreases for longer gaps. This paper explores recent advancements in deep learning and, particularly, diffusion models, for the task of audio inpainting. The proposed method uses an unconditionally trained generative model, which can be conditioned in a zero-shot fashion for audio inpainting, offering high flexibility to regenerate gaps of arbitrary length. An improved deep neural network architecture based on the constant-Q transform, which allows the model to exploit pitch-equivariant symmetries in audio, is also presented. The performance of the proposed algorithm is evaluated through objective and subjective metrics for the task of reconstructing short to mid-sized gaps. The results of a formal listening test show that the proposed method delivers a comparable performance against state-of-the-art for short gaps, while retaining a good audio quality and outperforming the baselines for the longest gap lengths tested, 150 ms and 200 ms. This work helps improve the restoration of sound recordings having fairly long local disturbances or dropouts, which must be reconstructed. Comment: Submitted for publication to the Journal of Audio Engineering Society on January 30th, 202

arXiv.org e-Print Archive

Efficient target-response interpolation for a graphic equalizer

Author: Belloch Jose A.
Välimäki Vesa
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2016

Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, held in Shanghai (China) during 20-25 March 2016. A graphic equalizer is an adjustable filter in which the command gain of each frequency band is practically independent of the gains of other bands. Designing a graphic equalizer with a high precision requires evaluating a target response that interpolates the magnitude response at several frequency points between the command gains. Good accuracy has been previously achieved by using polynomial interpolation methods such as cubic Hermite or spline interpolation. However, these methods require large computational resources, which is a limitation in real-time applications. This paper proposes an efficient way of computing the target response without sacrificing the approximation accuracy. This new approach called Linear Interpolation with Constant Segments (LICS) reduces the computing time of the target response by 55% and has an intrinsic parallel structure. Performance of the LICS method is assessed on an ARM Cortex-A7 core, which is commonly used in embedded systems. This work was conducted in spring 2015 when the first author was a visiting postdoctoral researcher at Aalto University. This research has been partly funded by the TIN2014-53495-R and TIN2011-23283 projects of the Ministerio de Economía y Competitividad and FEDER

Crossref

Repositori Institucional de la Universitat Jaume I
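
The abstract above describes a target response that interpolates between the command gains of a graphic equalizer. As a rough illustration of that idea only, the sketch below performs plain piecewise-linear interpolation of gains (in dB) over a log-frequency axis. It is not the LICS method, whose details the abstract does not give; the function name `interp_target_db`, the band centers, and the gain values are all hypothetical.

```python
import math

def interp_target_db(centers_hz, gains_db, query_hz):
    """Piecewise-linear interpolation of command gains over log-frequency.

    Generic illustration only; NOT the LICS method from the abstract.
    centers_hz must be sorted in ascending order.
    """
    x = math.log2(query_hz)
    xs = [math.log2(f) for f in centers_hz]
    # Hold the outermost gains beyond the first and last band centers.
    if x <= xs[0]:
        return gains_db[0]
    if x >= xs[-1]:
        return gains_db[-1]
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return gains_db[i] + t * (gains_db[i + 1] - gains_db[i])

# Hypothetical octave-band command gains in dB.
centers = [125.0, 250.0, 500.0, 1000.0]
gains = [0.0, 6.0, -3.0, 0.0]

# Geometric midpoint between the 250 Hz and 500 Hz bands.
mid = interp_target_db(centers, gains, math.sqrt(250.0 * 500.0))
```

At the geometric midpoint of two neighboring bands, the interpolated value is the average of their two command gains, which is the behavior one would expect from linear interpolation on a logarithmic frequency axis.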

Synthesis of Flute Sound Using a Computational Model (Huilun äänen synteesi laskennallisen mallin avulla)

Author: Välimäki Vesa
Publication date: 01/01/1992

Aaltodoc Publication Archive

EQ

Author: Liski Juho
Rämö Jussi
Välimäki Vesa
Publication venue: Akustinen Seura ry
Publication date: 28/10/2019

Equalization is widely used in acoustics and audio engineering, for example to correct the frequency response of a sound reproduction system. The design of equalizers (EQ) has advanced considerably in recent years. In this article, we focus on graphic equalizers, whose design is challenging. We present two principles for implementing an equalizer: the cascade structure and the parallel structure. Our latest graphic equalizers meet the critical hi-fi requirement that the frequency response must match the settings to within one decibel. In the cascade structure of a graphic equalizer, this is achieved by selecting an appropriate parametric filter for each band, adjusting the filter bandwidths so that the effect of each gain on the neighboring bands is known, and solving the gains of the band filters with the least-squares method. An accurate and efficient parallel graphic equalizer is obtained by converting the cascade structure into a delayed parallel form, which is a novelty in this field. Because updating the parameters of octave and third-octave equalizers designed with these methods requires a lot of computation, we have replaced the gain optimization with an artificial neural network. Thanks to the methods we have developed, the design problem of the graphic octave and third-octave equalizer is now practically solved. Non peer reviewed

Aaltodoc Publication Archive

Solving Audio Inverse Problems with a Diffusion Model

Author: Lehtinen Jaakko
Moliner Eloi
Välimäki Vesa
Publication date: 08/11/2022

This paper presents CQT-Diff, a data-driven generative audio model that can, once trained, be used for solving various different audio inverse problems in a problem-agnostic setting. CQT-Diff is a neural diffusion model with an architecture that is carefully constructed to exploit pitch-equivariant symmetries in music. This is achieved by preconditioning the model with an invertible Constant-Q Transform (CQT), whose logarithmically-spaced frequency axis represents pitch equivariance as translation equivariance. The proposed method is evaluated with objective and subjective metrics in three different and varied tasks: audio bandwidth extension, inpainting, and declipping. The results show that CQT-Diff outperforms the compared baselines and ablations in audio bandwidth extension and, without retraining, delivers competitive performance against modern baselines in audio inpainting and declipping. This work represents the first diffusion-based general framework for solving inverse problems in audio processing. Comment: Submitted to ICASSP 202

arXiv.org e-Print Archive

Aaltodoc Publication Archive
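
The CQT-Diff abstract above states that a logarithmically spaced frequency axis turns pitch shifts into translations. A minimal numeric sketch of that property follows. It does not compute an actual Constant-Q Transform and is not the paper's code; it only maps frequencies to fractional bin positions on a log axis. The helper names, the 32.70 Hz lowest bin, and the 12 bins per octave are assumptions chosen for illustration.

```python
import math

def cqt_bin(freq_hz, f_min=32.70, bins_per_octave=12):
    """Fractional bin index of freq_hz on a log-spaced (constant-Q) axis.

    f_min and bins_per_octave are illustrative assumptions.
    """
    return bins_per_octave * math.log2(freq_hz / f_min)

def transpose(freq_hz, semitones):
    """Shift a frequency by a number of equal-tempered semitones."""
    return freq_hz * 2.0 ** (semitones / 12.0)

# First five harmonics of a note at 110 Hz, then the same note 3 semitones up.
partials = [110.0 * h for h in range(1, 6)]
shifted = [transpose(f, 3) for f in partials]

# On the log axis every partial moves by the same number of bins.
bin_offsets = [cqt_bin(b) - cqt_bin(a) for a, b in zip(partials, shifted)]

# On a linear axis the offset in Hz grows with the harmonic number.
hz_offsets = [b - a for a, b in zip(partials, shifted)]
```

With 12 bins per octave, a transposition of 3 semitones moves every partial by about 3 bins, so the whole harmonic pattern is translated rigidly. On the linear axis the same transposition stretches the pattern, which is why a translation-equivariant network benefits from the log-frequency representation.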

Real-time emulation of the Clavinet

Author: Leonardo Gabrielli
Stefan Bilbao
Vesa Välimäki
Publication date: 01/01/2011

Leonardo Gabrielli, Vesa Välimäki, Stefan Bilbao

University of Michigan Library Repository

IRIS Università Politecnica delle Marche

Zero-Shot Blind Audio Bandwidth Extension

Author: Elvander Filip
Moliner Eloi
Välimäki Vesa
Publication date: 02/06/2023

Audio bandwidth extension involves the realistic reconstruction of high-frequency spectra from bandlimited observations. In cases where the lowpass degradation is unknown, such as in restoring historical audio recordings, this becomes a blind problem. This paper introduces a novel method called BABE (Blind Audio Bandwidth Extension) that addresses the blind problem in a zero-shot setting, leveraging the generative priors of a pre-trained unconditional diffusion model. During the inference process, BABE utilizes a generalized version of diffusion posterior sampling, where the degradation operator is unknown but parametrized and inferred iteratively. The performance of the proposed method is evaluated using objective and subjective metrics, and the results show that BABE surpasses state-of-the-art blind bandwidth extension baselines and achieves competitive performance compared to non-blind filter-informed methods when tested with synthetic data. Moreover, BABE exhibits robust generalization capabilities when enhancing real historical recordings, effectively reconstructing the missing high-frequency content while maintaining coherence with the original recording. Subjective preference tests confirm that BABE significantly improves the audio quality of historical music recordings. Examples of historical recordings restored with the proposed method are available on the companion webpage: (http://research.spa.aalto.fi/publications/papers/ieee-taslp-babe/). Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing

arXiv.org e-Print Archive

Adversarial Guitar Amplifier Modelling With Unpaired Data

Author: Juvela Lauri
Välimäki Vesa
Wright Alec
Publication date: 02/11/2022

We propose an audio effects processing framework that learns to emulate a target electric guitar tone from a recording. We train a deep neural network using an adversarial approach, with the goal of transforming the timbre of a guitar into the timbre of another guitar after audio effects processing has been applied, for example, by a guitar amplifier. The model training requires no paired data, and the resulting model emulates the target timbre well whilst being capable of real-time processing on a modern personal computer. To verify our approach, we present two experiments: one that carries out unpaired training using paired data, allowing us to monitor training via objective metrics, and another that uses fully unpaired data, corresponding to a realistic scenario where a user wants to emulate a guitar timbre using only audio data from a recording. Our listening test results confirm that the models are perceptually convincing.

arXiv.org e-Print Archive

Aaltodoc Publication Archive

Five Variations on a Feedback Theme

Author: Kleimola Jari
Lazzarini Victor
Timoney Joseph
Välimäki Vesa
Publication venue: Dept. of Electronic Engineering, Queen Mary Univ. of London
Publication date: 01/09/2009

This is a study on a set of feedback amplitude modulation oscillator equations. It is based on a very simple and inexpensive algorithm which is capable of generating a complex spectrum from a sinusoidal input. We examine the original and five variations on it, discussing the details of each synthesis method. These include the addition of extra delay terms, waveshaping of the feedback signal, further heterodyning and increasing the loop delay. In complement, we provide a software implementation of these algorithms as a practical example of their application and as a demonstration of their potential for synthesis instrument design.

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive
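
The abstract of "Five Variations on a Feedback Theme" above refers to feedback amplitude modulation oscillator equations without stating them. The sketch below assumes a basic first-order form, y[n] = cos(w0 n) (1 + beta y[n-1]), in which a cosine is amplitude-modulated by its own delayed output. This form, the function name `fbam`, and the parameter values are assumptions for illustration, and none of the five variations is reproduced here.

```python
import math

def fbam(f0_hz, beta, num_samples, sample_rate=44100.0):
    """Assumed basic feedback AM oscillator.

    y[n] = cos(w0 * n) * (1 + beta * y[n - 1]), with y[-1] = 0.
    beta is the feedback amount; beta = 0 gives a plain cosine.
    """
    w0 = 2.0 * math.pi * f0_hz / sample_rate
    y_prev = 0.0
    out = []
    for n in range(num_samples):
        y = math.cos(w0 * n) * (1.0 + beta * y_prev)
        out.append(y)
        y_prev = y
    return out

# Hypothetical settings: a 220 Hz oscillator with and without feedback.
plain = fbam(220.0, 0.0, 1000)
rich = fbam(220.0, 0.5, 1000)
```

With the feedback amount at zero the output reduces to the sinusoidal input, and raising it adds partials at the cost of one multiply and one add per sample, in line with the abstract's description of the algorithm as simple and inexpensive.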